Working with compressed concordances

Authors

  • Miri Kopel Ben-Nissan
  • Shmuel Tomi Klein
Abstract

A combination of new compression methods is suggested for compressing the concordance of a large Information Retrieval system. The methods aim to allow most of the processing to be done directly on the compressed file, requiring decompression, if at all, only for small parts of the accessed data, which saves I/O operations and CPU time.

Similar resources

Analyzing the Sense Distribution of Concordances Obtained by Web as Corpus Approach

In corpus-based lexicography and natural language processing fields some authors have proposed using the Internet as a source of corpora for obtaining concordances of words. Most techniques implemented with this method are based on information retrieval-oriented web searchers. However, rankings of concordances obtained by these search engines are not built according to linguistic criteria but t...

Conceptual Clustering of Korean Concordances Using Similarity between Morphemes

This paper describes a method for the conceptual clustering of Korean concordances. We present a method of computing conceptual similarity between concordances using the number of cooccurring morphemes and the similarities between morphemes. We use mutual information, the similarity between mutual information values and vector similarity to compute similarity between morphemes. When we try to c...

A Method To Reduce Large Number Of Concordances

In order to help solve the problem of analysing a large number of concordances of a given word 'W', the 'Diccionario del Español de México' (DEM) has implemented a programme that i) reduces this number, so as to obtain the maximum possible information with the minimum number of concordances to be handled; ii) sorts and rearranges the output so that similar concordances are printed out...

Automatic Compilation Of Modern Chinese Concordances

This paper describes an experiment to compile Chinese concordances automatically. A very large volume of KWIC indexes for modern Chinese (one million lines per set) has been compiled successfully with a kanji printer for Japanese. This paper discusses the purposes of the experiment, selection and input of the Chinese data, some statistics on Chinese characters (vs. kanji) and the concordance co...

Prenatal development of monozygotic twins and concordance for schizophrenia.

While twin concordances for schizophrenia have been used to estimate heritability and to develop genetic models, concordances in subtypes of monozygotic (MZ) twins can also be used to investigate the influence of prenatal development in the etiology of mental illness. We used within-pair variability and mirroring of fingerprints to estimate retrospectively the placentation status of concordant ...

Publication date: 2006